DETAILED ACTION

1. This Office Action is in response to correspondence filed May 24, 2007 in
reference to application 10/584,360. Claims 1-81 are pending and have been 
examined. The application is a 371 of PCT/JP04/19426 filed December 24, 2004. 

Information Disclosure Statement 

2. The Information Disclosure Statements filed October 9, 2007, April 30, 2010, and 
July 30, 2010 have been accepted and considered in this office action. 

Claim Rejections - 35 USC § 112

3. The following is a quotation of the second paragraph of 35 U.S.C. 112: 

The specification shall conclude with one or more claims particularly pointing out and distinctly 
claiming the subject matter which the applicant regards as his invention. 

4. Claims 70-81 are rejected under 35 U.S.C. 112, second paragraph, as being
indefinite for failing to particularly point out and distinctly claim the subject matter which
applicant regards as the invention. Claims 70-81 are directed towards "programs which
allow a computer to function." However, the bodies of each claim recite various "means
for" limitations. The use of "means for" imparts the functional description of each
limitation that is found in the specification. In the instant case, the specification
describes non-volatile memory and a CPU which execute the program to implement
each limitation (see specification page 29 lines 1-7). Thus the body of the claims,
because each limitation uses "means for," claims the "means for" described in the
specification, which include hardware elements. The claims therefore consist of
programs which contain hardware elements, which cannot exist. Thus claims 70-81 are
rejected as being indefinite.



Claim Rejections - 35 USC § 101 

5. 35 U.S.C. 101 reads as follows: 

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of 
matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the 
conditions and requirements of this title. 

6. Claim(s) 58-69 are rejected under 35 USC 101 as not falling within one of the
four statutory categories of invention. While the claim(s) recite a series of steps or acts 
to be performed, a statutory "process" under 35 USC 101 must (1) be tied to another 
statutory category (such as a manufacture or a machine), or (2) transform underlying 
subject matter (such as an article or material) to a different state or thing. The instant 
claim(s) neither transform underlying subject matter (i.e., the claim does not include any 
type of physical transformation, only a manipulation of speech data. Manipulations of 
data are not physical transformations) nor positively recite structure associated with 
another statutory category (i.e., the claimed process does not rely on any type of 
physical hardware and could be performed by a human. For example, the human could 
recognize speech by listening to somebody talk and understanding, specify the context 
by determining what command was said, and specify a process by determining what 
action is associated with the command. Furthermore a user could perform the specified 
command, such as turning up the volume on the radio, or manually selecting navigation
functions. Although claims 58, 64, and 68 recite performing the control on the device, it
is noted that the device is ancillary to the invention. A user could perform the command
by manipulating controls on the device, thus the user is performing the control, not the 
device. Likewise, claims 59, 61, 63, 65, 67 and 69 recite a "predetermined 
communication device." This too is ancillary to the claimed invention, as the 
communication device is used for the mere retrieval of data to be manipulated. Further, 
the data retrieval step can still be performed by a human by, for example, looking
information up in a book, which is a communication device in a broad sense.), and 
therefore do not define a statutory process. 

7. Claims 70-81 are rejected under 35 U.S.C. 101 because the claimed invention is 
directed to non-statutory subject matter. Claims 70-81 are directed towards various 
"programs which allow a computer to function." Thus these claims are mere software or 
computer code, which have been held to be non-statutory under 35 U.S.C. 101.
Therefore claims 70-81 are rejected as being non-statutory. 

8. Note that claims 1-57 are NOT rejected as being non-statutory because they
recite "means for" in their limitations. The use of "means for" imparts the functional
description of each limitation that is found in the specification. In the instant case, the
specification describes non-volatile memory and a CPU which execute the program to
implement each limitation (see specification page 29 lines 1-7). Thus the claims recite
hardware and are statutory.

Claim Rejections - 35 USC § 103

9. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

10. Claims 1, 4-11, 13-20, 23-31, 33-40, 42-49, and 51-81 are rejected under 35
U.S.C. 103(a) as being unpatentable over Funk et al. (US PAP 2003/0065427) in view
of Kennewick et al. (US PAP 2004/0193420).

11. Consider claim 1, Funk teaches a device control device (abstract) comprising:
speech recognition means which acquires speech data representing a speech
and specifies words represented by the speech by performing speech recognition on
the speech data (paragraph 0018, speech recognition is provided for command and
control); and

process execution means which specifies a content of control to be performed on
an external device to be a control target based on the specified content, and performs
the control (paragraphs 0019-0020, verbal command keywords result in the mobile unit
performing different operations, such as retrieving information or controlling radio
functions).

Funk does not specifically teach specifying means which specifies a content of 
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which
specifies a content of the speech uttered by an utterer based on the words specified by
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

12. Consider claim 4, Funk and Kennewick teach the device control device according 
to claim 1. Kennewick further teaches that the specifying means holds information which
associates words with one or more categories, and specifies a content of the speech 
uttered by the utterer based on a category in which the words specified by the speech 
recognition means are classified (the system associates keywords with contexts, using 
keyword matching; 0160. In this example, the word "temperature" is associated with 
two different contexts of "weather" and "measurement." Parser uses prior probability, 
which must be stored to be used, to determine the proper context.). 
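
For illustration only, the keyword-to-context association discussed above can be sketched in a few lines of Python. This is a hypothetical sketch and is not taken from Kennewick, Funk, or the application: the table `KEYWORD_CONTEXTS`, the prior values in it, and the function `pick_context` are assumptions chosen to mirror the "temperature" example, in which one word is associated with two contexts and a stored prior probability decides between them.

```python
# Hypothetical stored table: each keyword maps to candidate contexts,
# each with an assumed prior probability (values are illustrative only).
KEYWORD_CONTEXTS = {
    "temperature": {"weather": 0.7, "measurement": 0.3},
    "forecast": {"weather": 0.9, "news": 0.1},
    "convert": {"measurement": 0.8, "currency": 0.2},
}


def pick_context(words):
    """Return the context with the highest summed prior, or None if no keyword matches."""
    scores = {}
    for word in words:
        # Keyword matching: look the recognized word up in the stored table.
        for context, prior in KEYWORD_CONTEXTS.get(word.lower(), {}).items():
            scores[context] = scores.get(context, 0.0) + prior
    if not scores:
        return None
    return max(scores, key=scores.get)
```

Under these assumed priors, the single word "temperature" resolves to the "weather" context, while "convert" together with "temperature" resolves to "measurement", since the summed priors then favor that context.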

13. Consider claim 5, Funk and Kennewick teach the device control device according 
to claim 1. Kennewick further teaches the specifying means holds correlation
information which associates words of different meanings or different categories with
each process of the process execution means, and specifies a content of the speech 
uttered by the utterer based on a combination of those words or categories which are 
specified by the speech recognition means and the correlation information (the system 
associates keywords with contexts, using keyword matching; 0160. In this example, the 
word "temperature" is associated with two different contexts of "weather" and 
"measurement." Parser uses prior probability, which must be stored to be used, to 
determine the proper context. This probability information represents how probable it is
that a word is associated with a context, and thus represents the correlation information
between the word and the context.). 

14. Consider claim 6, Funk and Kennewick teach the device control device according 
to claim 1. Kennewick further teaches wherein the specifying means holds information
which associates words with one or more categories, and specifies a content of the 
speech uttered by the utterer based on a category in which a plurality of words specified 
by the speech recognition means are commonly classified (the system associates 
keywords with contexts, using keyword matching; 0160. In this example, the word 
"temperature" is associated with two different contexts of "weather" and "measurement." 
Parser uses prior probability, which must be stored to be used, to determine the proper 
context. This probability represents how commonly the word is associated with a 
context). 

15. Consider claim 7, Funk and Kennewick teach the device control device according
to claim 1, wherein the specifying means holds a plurality of words assigned to
respective processes of the process execution means (Funk, 0019-0020, keywords are
associated with various commands), and performs a corresponding process when at
least one of the words specified by the speech recognition means is a word assigned to
the process (Funk, 0019-0020, commands are executed based on received
keywords.).

16. Consider claim 8, the current combination of Funk and Kennewick teaches the
device control device according to claim 1, but does not specifically teach that, when a
meaning of an input speech is not discriminatable, the specifying means prompts an
input in a more discriminatable expression.

However, Kennewick further teaches that, when a meaning of an input speech
is not discriminatable, the specifying means prompts an input in a more discriminatable
expression (0161, system can question the user to verify question and allow them to
rephrase to remove ambiguity.).

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to allow a user to clarify a question or a command as further taught by 
Kennewick, in the system of Funk and Kennewick in order to allow the system to ensure
an accurate response when the confidence level of a correct understanding is not high 
(Kennewick 0161). 

17. Consider claim 9, Funk and Kennewick teach the device control device
according to claim 1, further comprising information acquisition means which acquires 
information from an external device (Funk, 0019 and 0026, system may access 
information such as stock reports and weather from a voice portal server), and wherein 
the specifying means selects an output content to be output based on the information 
acquired by the information acquisition means (Funk, 0019, keyword command may 
also be used to access information from information accessing device either through 
text or audible format.). 

18. Consider claim 10, Funk teaches a device control device (abstract) comprising: 
speech recognition means which acquires speech data representing a speech
and specifies words represented by the speech by performing speech recognition on
the speech data (paragraph 0018, speech is acquired and applied to speech recognition,
which is provided for command and control);

process specifying means which specifies a content of control to be performed
on an external device to be a control target based on the specified content (paragraphs
0019-0020, verbal command keywords result in the mobile unit performing different
operations, such as retrieving information or controlling radio functions);

information acquisition means which acquires information via predetermined 
communication means (0019 and 0026, and figure 2, system may access information 
such as stock reports and weather from a voice portal server via a call); and 

speech output means which outputs a speech based on the information acquired
by the information acquisition means (0019, stock and weather information may be 
output to the driver in an audible format, which would be speech), 

whereby when the control specified by the process specifying means is to output 
information acquired by the information acquisition means, the speech output means 
outputs a speech based on the information (0019, keyword command may be used to
access information from information accessing device either through text or audible 
format. 0031 provides an example of the spoken dialog). 

Funk does not specifically teach specifying means which specifies a content of 
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which 
specifies a content of the speech uttered by an utterer based on the words specified by 
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 



Application/Control Number: 10/584,360 Page 11

Art Unit: 2626 

19. Consider claim 11, Funk teaches a speech recognition device (abstract)
comprising:

speech recognition means which acquires speech data representing a speech
and specifies words represented by the speech by performing speech recognition on
the speech data (paragraph 0018, speech recognition is provided for command and
control); and

process execution means which specifies a process to be performed based on
the specified content, and performs the process (paragraphs 0019-0020, verbal
command keywords result in the mobile unit performing different operations, such as
retrieving information or controlling radio functions).

Funk does not specifically teach specifying means which specifies a content of 
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which 
specifies a content of the speech uttered by an utterer based on the words specified by 
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the
speech recognition in the system of Funk in order to allow the system to better
understand natural language queries, and reduce the need for the use of keywords
(Kennewick 0006-0007). 

20. Consider claim 13, Funk and Kennewick teach the speech recognition device 
according to claim 11. Kennewick further teaches that the specifying means holds
information which associates words with one or more categories, and specifies a 
content of the speech uttered by the utterer based on a category in which the words 
specified by the speech recognition means are classified (the system associates 
keywords with contexts, using keyword matching; 0160. In this example, the word 
"temperature" is associated with two different contexts of "weather" and "measurement." 
Parser uses prior probability, which must be stored to be used, to determine the proper 
context.). 

21. Consider claim 14, Funk and Kennewick teach the speech recognition device
according to claim 11. Kennewick further teaches the specifying means holds
correlation information which associates words of different meanings or different 
categories with each process of the process execution means, and specifies a content 
of the speech uttered by the utterer based on a combination of those words or 
categories which are specified by the speech recognition means and the correlation 
information (the system associates keywords with contexts, using keyword matching; 
0160. In this example, the word "temperature" is associated with two different contexts 
of "weather" and "measurement." Parser uses prior probability, which must be stored to 
be used, to determine the proper context. This probability information represents how
probable it is that a word is associated with a context, and thus represents the correlation
information between the word and the context.).

22. Consider claim 15, Funk and Kennewick teach the speech recognition device 
according to claim 11. Kennewick further teaches wherein the specifying means holds
information which associates words with one or more categories, and specifies a 
content of the speech uttered by the utterer based on a category in which a plurality of 
words specified by the speech recognition means are commonly classified (the system 
associates keywords with contexts, using keyword matching; 0160. In this example, the 
word "temperature" is associated with two different contexts of "weather" and 
"measurement." Parser uses prior probability, which must be stored to be used, to 
determine the proper context. This probability represents how commonly the word is 
associated with a context). 

23. Consider claim 16, Funk and Kennewick teach the speech recognition device 
according to claim 11, wherein the specifying means holds a plurality of words assigned
to respective processes of the process execution means (Funk, 0019-0020, keywords
are associated with various commands), and performs a corresponding process when
at least one of the words specified by the speech recognition means is a word assigned
to the process (Funk, 0019-0020, commands are executed based on received
keywords.).

24. Consider claim 17, the current combination of Funk and Kennewick teaches the
speech recognition device according to claim 11, but does not specifically teach that, when a
meaning of an input speech is not discriminatable, the specifying means prompts an
input in a more discriminatable expression.

However, Kennewick further teaches that, when a meaning of an input speech
is not discriminatable, the specifying means prompts an input in a more discriminatable
expression (0161, system can question the user to verify question and allow them to
rephrase to remove ambiguity.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to allow a user to clarify a question or a command as further taught by 
Kennewick, in the system of Funk and Kennewick in order to allow the system to ensure
an accurate response when the confidence level of a correct understanding is not high 
(Kennewick 0161). 

25. Consider claim 18, Funk and Kennewick teach the speech recognition device
according to claim 11, further comprising information acquisition means which acquires
information from an external device (Funk, 0019 and 0026, system may access 
information such as stock reports and weather from a voice portal server), and wherein 
the specifying means selects an output content to be output based on the information 
acquired by the information acquisition means (Funk, 0019, keyword command may 
also be used to access information from information accessing device either through
text or audible format.). 

26. Consider claim 19, Funk teaches a speech recognition device (abstract) 
comprising: 

speech recognition means which acquires speech data representing a speech 
and specifies words represented by the speech by performing speech recognition on 
the speech data (paragraph 0018, speech is acquired and applied to speech recognition,
which is provided for command and control);

process specifying means which specifies a process to be performed based on
the specified content (paragraphs 0019-0020, verbal command keywords result in the
mobile unit performing different operations, such as retrieving information or controlling 
radio functions); 

information acquisition means which acquires information via predetermined 
communication means (0019 and 0026, and figure 2, system may access information 
such as stock reports and weather from a voice portal server via a call); and 

speech output means which outputs a speech based on the information acquired 
by the information acquisition means (0019, stock and weather information may be 
output to the driver in an audible format, which would be speech), 

whereby when the control specified by the process specifying means is to output 
information acquired by the information acquisition means, the speech output means 
outputs a speech based on the information (0019, keyword command may be used to
access information from information accessing device either through text or audible
format. 0031 provides an example of the spoken dialog). 

Funk does not specifically teach specifying means which specifies a content of 
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which 
specifies a content of the speech uttered by an utterer based on the words specified by 
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

27. Consider claim 20, Funk teaches an agent device (abstract, provides data to 
user) comprising: 

speech recognition means which acquires speech data representing a speech 
and specifies words represented by the speech by performing speech recognition on 
the speech data (paragraph 0018, speech recognition is provided for command and
control); and

process execution means which specifies a process to be performed based on
the specified content, and performs the process (paragraphs 0019-0020, verbal
command keywords result in the mobile unit performing different operations, such as
retrieving information or controlling radio functions). 

Funk does not specifically teach specifying means which specifies a content of 
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which 
specifies a content of the speech uttered by an utterer based on the words specified by 
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

28. Consider claim 23, Funk and Kennewick teach the agent device according to 
claim 20. Kennewick further teaches that the specifying means holds information which 
associates words with one or more categories, and specifies a content of the speech 
uttered by the utterer based on a category in which the words specified by the speech 
recognition means are classified (the system associates keywords with contexts, using
keyword matching; 0160. In this example, the word "temperature" is associated with 
two different contexts of "weather" and "measurement." Parser uses prior probability, 
which must be stored to be used, to determine the proper context.). 

29. Consider claim 24, Funk and Kennewick teach the agent device according to 
claim 20. Kennewick further teaches the specifying means holds correlation information 
which associates words of different meanings or different categories with each process 
of the process execution means, and specifies a content of the speech uttered by the 
utterer based on a combination of those words or categories which are specified by the 
speech recognition means and the correlation information (the system associates 
keywords with contexts, using keyword matching; 0160. In this example, the word 
"temperature" is associated with two different contexts of "weather" and "measurement." 
Parser uses prior probability, which must be stored to be used, to determine the proper 
context. This probability information represents how probable it is that a word is
associated with a context, and thus represents the correlation information between the 
word and the context.). 

30. Consider claim 25, Funk and Kennewick teach the agent device according to 
claim 20. Kennewick further teaches wherein the specifying means holds information 
which associates words with one or more categories, and specifies a content of the 
speech uttered by the utterer based on a category in which a plurality of words specified 
by the speech recognition means are commonly classified (the system associates
keywords with contexts, using keyword matching; 0160. In this example, the word 
"temperature" is associated with two different contexts of "weather" and "measurement." 
Parser uses prior probability, which must be stored to be used, to determine the proper 
context. This probability represents how commonly the word is associated with a 
context). 

31. Consider claim 26, Funk and Kennewick teach the agent device according to
claim 20, wherein the specifying means holds a plurality of words assigned to respective
processes of the process execution means (Funk, 0019-0020, keywords are associated
with various commands), and performs a corresponding process when at least one of
the words specified by the speech recognition means is a word assigned to the process
(Funk, 0019-0020, commands are executed based on received keywords.).

32. Consider claim 27, the current combination of Funk and Kennewick teaches the
agent device according to claim 20, but does not specifically teach that, when a meaning of
an input speech is not discriminatable, the specifying means prompts an input in a more
discriminatable expression.

However, Kennewick further teaches that, when a meaning of an input speech
is not discriminatable, the specifying means prompts an input in a more discriminatable
expression (0161, system can question the user to verify question and allow them to
rephrase to remove ambiguity.).

Therefore it would have been obvious to one of ordinary skill in the art at the time
of the invention to allow a user to clarify a question or a command as further taught by
Kennewick, in the system of Funk and Kennewick in order to allow the system to ensure
an accurate response when the confidence level of a correct understanding is not high
(Kennewick 0161).

Consider claim 28, Funk and Kennewick teach the agent device according to
claim 20, further comprising information acquisition means which acquires information 
from an external device (Funk, 0019 and 0026, system may access information such as 
stock reports and weather from a voice portal server), and wherein the specifying 
means selects an output content to be output based on the information acquired by the 
information acquisition means (Funk, 0019, keyword command may also be used to 
access information from information accessing device either through text or audible 
format.). 

33. Consider claim 29, Funk teaches the agent device according to claim 20, wherein 
the specifying means includes means which, when the process specified as a process 
to be performed is a process of presenting information externally received to the utterer, 
performs the presentation by generating a speech which reads out the information 
(0019, information may be read out to a user in an audible format. 0031 provides an 
example of the spoken dialog).

34. Consider claim 30, Funk teaches an agent device (abstract, provides data to
user) comprising: 

speech recognition means which acquires speech data representing a speech 
and specifies words represented by the speech by performing speech recognition on 
the speech data (paragraph 0018, speech is acquired and applied to speech recognition,
which is provided for command and control);

process specifying means which specifies a process to be performed based on
the specified content (paragraphs 0019-0020, verbal command keywords result in the
mobile unit performing different operations, such as retrieving information or controlling 
radio functions); 

information acquisition means which acquires information via predetermined 
communication means (0019 and 0026, and figure 2, system may access information 
such as stock reports and weather from a voice portal server via a call); and 

speech output means which outputs a speech based on the information acquired 
by the information acquisition means (0019, stock and weather information may be 
output to the driver in an audible format, which would be speech), 

whereby when the control specified by the process specifying means is to output 
information acquired by the information acquisition means, the speech output means 
outputs a speech based on the information (0019, keyword command may be used to
access information from information accessing device either through text or audible
format. 0031 provides an example of the spoken dialog).

Funk does not specifically teach specifying means which specifies a content of
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which 
specifies a content of the speech uttered by an utterer based on the words specified by 
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

35. Consider claim 31, Funk teaches an on-vehicle control device so constructed as 
to be mountable on a vehicle having an external on-vehicle device mounted thereon 
(abstract, on board device, figure 6, control screen 25) comprising: 

speech recognition means which acquires speech data representing a speech 
and specifies words represented by the speech by performing speech recognition on 
the speech data (paragraph 0018, speech recognition is provided for command and 
control); and 

process execution means which specifies a content of control to be performed on 
the on-vehicle device to be a control target based on the specified content, and 
performs the control (paragraphs 0019-0020, verbal command keywords result in the 
mobile unit performing different operations, such as retrieving information or controlling 
radio functions). 

Funk does not specifically teach specifying means which specifies a content of 
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which 
specifies a content of the speech uttered by an utterer based on the words specified by 
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

36. Consider claim 33, Funk and Kennewick teach the on-vehicle control device 
according to claim 31. Kennewick further teaches that the specifying means holds 
information which associates words with one or more categories, and specifies a 
content of the speech uttered by the utterer based on a category in which the words 
specified by the speech recognition means are classified (the system associates 
keywords with contexts, using keyword matching; 0160. In this example, the word 
"temperature" is associated with two different contexts of "weather" and "measurement." 
Parser uses prior probability, which must be stored to be used, to determine the proper 
context.). 
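
For illustration only, and not taken from Kennewick: the keyword-to-context 
association described above can be sketched as follows. The table contents, the 
probability values, and the function name pick_context are assumptions made for 
this sketch; the reference itself does not disclose this code. 

```python
# Hypothetical stored table: each keyword maps to candidate contexts,
# each with an assumed (made-up) prior probability.
CONTEXT_PRIORS = {
    "temperature": {"weather": 0.7, "measurement": 0.3},
    "forecast": {"weather": 0.9, "finance": 0.1},
}


def pick_context(words):
    """Return the context with the highest summed prior, or None if no keyword matches."""
    scores = {}
    for word in words:
        # Keyword matching: look each recognized word up in the stored table.
        for context, prior in CONTEXT_PRIORS.get(word.lower(), {}).items():
            scores[context] = scores.get(context, 0.0) + prior
    if not scores:
        return None
    return max(scores, key=scores.get)
```

In this sketch the word "temperature" alone resolves to the "weather" context 
because that context carries the larger stored prior; a word absent from the 
table contributes nothing. 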

37. Consider claim 34, Funk and Kennewick teach the on-vehicle control device 
according to claim 31. Kennewick further teaches the specifying means holds 
correlation information which associates words of different meanings or different 
categories with each process of the process execution means, and specifies a content 
of the speech uttered by the utterer based on a combination of those words or 
categories which are specified by the speech recognition means and the correlation 
information (the system associates keywords with contexts, using keyword matching; 
0160. In this example, the word "temperature" is associated with two different contexts 
of "weather" and "measurement." Parser uses prior probability, which must be stored to 
be used, to determine the proper context. This probability information represents how 
probable it is that a word is associated with a context, and thus represents the correlation 
information between the word and the context.). 

38. Consider claim 35, Funk and Kennewick teach the on-vehicle control device 
according to claim 31. Kennewick further teaches wherein the specifying means holds 
information which associates words with one or more categories, and specifies a 
content of the speech uttered by the utterer based on a category in which a plurality of 
words specified by the speech recognition means are commonly classified (the system 
associates keywords with contexts, using keyword matching; 0160. In this example, the 
word "temperature" is associated with two different contexts of "weather" and 
"measurement." Parser uses prior probability, which must be stored to be used, to 
determine the proper context. This probability represents how commonly the word is 
associated with a context). 

39. Consider claim 36, Funk and Kennewick teach the on-vehicle control device 
according to claim 31, wherein the specifying means holds a plurality of words assigned 
to respective processes of the process execution means (Funk, 0019-0020, keywords 
are associated with various commands), and performs a corresponding process when 
at least one of the words specified by the speech recognition means is a word assigned 
to the process (Funk, 0019-0020, commands are executed based on received 
keywords.). 

40. Consider claim 37, the current combination of Funk and Kennewick teaches the 
on-vehicle control device according to claim 31, but does not specifically teach when a 
meaning of an input speech is not discriminatable, the specifying means prompts an 
input in a more discriminatable expression. 

However, Kennewick further teaches that when a meaning of an input speech 
is not discriminatable, the specifying means prompts an input in a more discriminatable 
expression (0161, system can question the user to verify question and allow them to 
rephrase to remove ambiguity.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to allow a user to clarify a question or a command as further taught by 
Kennewick, in the system of Funk and Kennewick in order to allow the system to ensure 
an accurate response when the confidence level of a correct understanding is not high 
(Kennewick 0161). 

41. Consider claim 38, Funk and Kennewick teach the on-vehicle control device 
according to claim 31, further comprising information acquisition means which acquires 
information from an external device (Funk, 0019 and 0026, system may access 
information such as stock reports and weather from a voice portal server), and wherein 
the specifying means selects an output content to be output based on the information 
acquired by the information acquisition means (Funk, 0019, keyword command may 
also be used to access information from information accessing device either through 
text or audible format.). 

42. Consider claim 39, Funk teaches an on-vehicle control device so constructed as 
to be mountable on a vehicle having an external on-vehicle device mounted thereon 
(abstract, on board device, figure 6, control screen 25) comprising: 

speech recognition means which acquires speech data representing a speech 
and specifies words represented by the speech by performing speech recognition on 
the speech data (paragraph 0018, speech is acquired and applied to speech recognition, 
which is provided for command and control); 

process specifying means which specifies a content of control to be performed 
on the on-vehicle device to be a control target based on the specified content 
(paragraphs 0019-0020, verbal command keywords result in the mobile unit performing 
different operations, such as retrieving information or controlling radio functions); 

information acquisition means which acquires information via predetermined 
communication means (0019 and 0026, and figure 2, system may access information 
such as stock reports and weather from a voice portal server via a call); and 

speech output means which outputs a speech based on the information acquired 
by the information acquisition means (0019, stock and weather information may be 
output to the driver in an audible format, which would be speech), 

whereby when the control specified by the process specifying means is to output 
information acquired by the information acquisition means, the speech output means 
outputs a speech based on the information (0019, keyword command may be used to 
access information from information accessing device either through text or audible 
format. 0031 provides an example of the spoken dialog). 

Funk does not specifically teach specifying means which specifies a content of 
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which 
specifies a content of the speech uttered by an utterer based on the words specified by 
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

43. Consider claim 40, Funk teaches a navigation device so constructed to be 
mountable on a vehicle (abstract, figure 5 unit 25) comprising: 

speech recognition means which acquires speech data representing a speech 
and specifies words represented by the speech by performing speech recognition on 
the speech data (paragraph 0018, speech recognition is provided for command and 
control); and 

process execution means which specifies a navigation process to be performed 
based on the specified content, and performs the navigation process (paragraphs 0019- 
0020, verbal command keywords result in the mobile unit performing different 
operations, such as navigation control functions). 

Funk does not specifically teach specifying means which specifies a content of 
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which 
specifies a content of the speech uttered by an utterer based on the words specified by 
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

44. Consider claim 42, Funk and Kennewick teach the navigation device according to 
claim 40. Kennewick further teaches that the specifying means holds information which 
associates words with one or more categories, and specifies a content of the speech 
uttered by the utterer based on a category in which the words specified by the speech 
recognition means are classified (the system associates keywords with contexts, using 
keyword matching; 0160. In this example, the word "temperature" is associated with 
two different contexts of "weather" and "measurement." Parser uses prior probability, 
which must be stored to be used, to determine the proper context.). 

45. Consider claim 43, Funk and Kennewick teach the navigation device according to 
claim 40. Kennewick further teaches the specifying means holds correlation information 
which associates words of different meanings or different categories with each process 
of the process execution means, and specifies a content of the speech uttered by the 
utterer based on a combination of those words or categories which are specified by the 
speech recognition means and the correlation information (the system associates 
keywords with contexts, using keyword matching; 0160. In this example, the word 
"temperature" is associated with two different contexts of "weather" and "measurement." 
Parser uses prior probability, which must be stored to be used, to determine the proper 
context. This probability information represents how probable it is that a word is 
associated with a context, and thus represents the correlation information between the 
word and the context.). 

46. Consider claim 44, Funk and Kennewick teach the navigation device according to 
claim 40. Kennewick further teaches wherein the specifying means holds information 
which associates words with one or more categories, and specifies a content of the 
speech uttered by the utterer based on a category in which a plurality of words specified 
by the speech recognition means are commonly classified (the system associates 
keywords with contexts, using keyword matching; 0160. In this example, the word 
"temperature" is associated with two different contexts of "weather" and "measurement." 
Parser uses prior probability, which must be stored to be used, to determine the proper 
context. This probability represents how commonly the word is associated with a 
context). 

47. Consider claim 45, Funk and Kennewick teach the navigation device according to 
claim 40, wherein the specifying means holds a plurality of words assigned to respective 
processes of the process execution means (Funk, 0019-0020, keywords are associated 
with various commands), and performs a corresponding process when at least one of 
the words specified by the speech recognition means is a word assigned to the process 
(Funk, 0019-0020, commands are executed based on received keywords.). 

48. Consider claim 46, the current combination of Funk and Kennewick teaches the 
navigation device according to claim 40, but does not specifically teach when a meaning 
of an input speech is not discriminatable, the specifying means prompts an input in a 
more discriminatable expression. 

However, Kennewick further teaches that when a meaning of an input speech 
is not discriminatable, the specifying means prompts an input in a more discriminatable 
expression (0161, system can question the user to verify question and allow them to 
rephrase to remove ambiguity.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to allow a user to clarify a question or a command as further taught by 
Kennewick, in the system of Funk and Kennewick in order to allow the system to ensure 
an accurate response when the confidence level of a correct understanding is not high 
(Kennewick 0161). 

49. Consider claim 47, Funk and Kennewick teach the navigation device according 
to claim 40, further comprising information acquisition means which acquires 
information from an external device (Funk, 0019 and 0026, system may access 
information such as stock reports and weather from a voice portal server), and wherein 
the specifying means selects an output content to be output based on the information 
acquired by the information acquisition means (Funk, 0019, keyword command may 
also be used to access information from information accessing device either through 
text or audible format.). 

50. Consider claim 48, Funk teaches a navigation device so constructed as to be 
mountable on a vehicle (abstract, on board device, figure 6, control screen 25) 
comprising: 

speech recognition means which acquires speech data representing a speech 
and specifies words represented by the speech by performing speech recognition on 
the speech data (paragraph 0018, speech is acquired and applied to speech recognition, 
which is provided for command and control); 

process specifying means which specifies a content of a navigation process to be 
performed based on the specified content (paragraphs 0019-0020, verbal command 
keywords result in the mobile unit performing different operations, such as navigation 
functions); 

information acquisition means which acquires information via predetermined 
communication means (0019 and 0026, and figure 2, system may access information 
such as stock reports and weather from a voice portal server via a call); and 

speech output means which outputs a speech based on the information acquired 
by the information acquisition means (0019, stock and weather information may be 
output to the driver in an audible format, which would be speech), 

whereby when the control specified by the process specifying means is to output 
information acquired by the information acquisition means, the speech output means 
outputs a speech based on the information (0019, keyword command may be used to 
access information from information accessing device either through text or audible 
format. 0031 provides an example of the spoken dialog). 

Funk does not specifically teach specifying means which specifies a content of 
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which 
specifies a content of the speech uttered by an utterer based on the words specified by 
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

51. Consider claim 49, Funk teaches an audio device (abstract, voice communicator) 
comprising: 

speech recognition means which acquires speech data representing a speech 
and specifies words represented by the speech by performing speech recognition on 
the speech data (paragraph 0018, speech recognition is provided for command and 
control); and 

process execution means which specifies a content of a speech process to be 
performed based on the specified content, and performs the speech process, or 
controls an external device in such a way as to cause the external device to perform the 
speech process (paragraphs 0019-0020, verbal command keywords result in the mobile 
unit performing different operations, including retrieving information, which may be 
presented in audible form through speech. Also see 0031 for an example of a speech 
dialog). 

Funk does not specifically teach specifying means which specifies a content of 
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which 
specifies a content of the speech uttered by an utterer based on the words specified by 
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

52. Consider claim 51, Funk and Kennewick teach the audio device according to 
claim 49. Kennewick further teaches that the specifying means holds information which 
associates words with one or more categories, and specifies a content of the speech 
uttered by the utterer based on a category in which the words specified by the speech 
recognition means are classified (the system associates keywords with contexts, using 
keyword matching; 0160. In this example, the word "temperature" is associated with 
two different contexts of "weather" and "measurement." Parser uses prior probability, 
which must be stored to be used, to determine the proper context.). 

53. Consider claim 52, Funk and Kennewick teach the audio device according to 
claim 49. Kennewick further teaches the specifying means holds correlation information 
which associates words of different meanings or different categories with each process 
of the process execution means, and specifies a content of the speech uttered by the 
utterer based on a combination of those words or categories which are specified by the 
speech recognition means and the correlation information (the system associates 
keywords with contexts, using keyword matching; 0160. In this example, the word 
"temperature" is associated with two different contexts of "weather" and "measurement." 
Parser uses prior probability, which must be stored to be used, to determine the proper 
context. This probability information represents how probable it is that a word is 
associated with a context, and thus represents the correlation information between the 
word and the context.). 

54. Consider claim 53, Funk and Kennewick teach the audio device according to 
claim 49. Kennewick further teaches wherein the specifying means holds information 
which associates words with one or more categories, and specifies a content of the 
speech uttered by the utterer based on a category in which a plurality of words specified 
by the speech recognition means are commonly classified (the system associates 
keywords with contexts, using keyword matching; 0160. In this example, the word 
"temperature" is associated with two different contexts of "weather" and "measurement." 
Parser uses prior probability, which must be stored to be used, to determine the proper 
context. This probability represents how commonly the word is associated with a 
context). 

55. Consider claim 54, Funk and Kennewick teach the audio device according to 
claim 49, wherein the specifying means holds a plurality of words assigned to respective 
processes of the process execution means (Funk, 0019-0020, keywords are associated 
with various commands), and performs a corresponding process when at least one of 
the words specified by the speech recognition means is a word assigned to the process 
(Funk, 0019-0020, commands are executed based on received keywords.). 

56. Consider claim 55, the current combination of Funk and Kennewick teaches the 
audio device according to claim 49, but does not specifically teach when a meaning of 
an input speech is not discriminatable, the specifying means prompts an input in a more 
discriminatable expression. 

However, Kennewick further teaches that when a meaning of an input speech 
is not discriminatable, the specifying means prompts an input in a more discriminatable 
expression (0161, system can question the user to verify question and allow them to 
rephrase to remove ambiguity.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to allow a user to clarify a question or a command as further taught by 
Kennewick, in the system of Funk and Kennewick in order to allow the system to ensure 
an accurate response when the confidence level of a correct understanding is not high 
(Kennewick 0161). 

57. Consider claim 56, Funk and Kennewick teach the audio device according to 
claim 49, further comprising information acquisition means which acquires information 
from an external device (Funk, 0019 and 0026, system may access information such as 
stock reports and weather from a voice portal server), and wherein the specifying 
means selects an output content to be output based on the information acquired by the 
information acquisition means (Funk, 0019, keyword command may also be used to 
access information from information accessing device either through text or audible 
format.). 

58. Consider claim 57, Funk teaches an audio device (abstract, voice communicator) 
comprising: 

speech recognition means which acquires speech data representing a speech 
and specifies words represented by the speech by performing speech recognition on 
the speech data (paragraph 0018, speech recognition is provided for command and 
control); and 

process execution means which specifies a content of a speech process to be 
performed based on the specified content (paragraphs 0019-0020, verbal command 
keywords result in the mobile unit performing different operations, including retrieving 
information, which may be presented in audible form through speech. Also see 0031 for 
an example of a speech dialog), 

information acquisition means which acquires information via predetermined 
communication means (0019 and 0026, and figure 2, system may access information 
such as stock reports and weather from a voice portal server via a call); and 

speech output means which outputs a speech based on the information acquired 
by the information acquisition means (0019, stock and weather information may be 
output to the driver in an audible format, which would be speech), 

whereby when the control specified by the process specifying means is to output 
information acquired by the information acquisition means, the speech output means 
outputs a speech based on the information (0019, keyword command may be used to 
access information from information accessing device either through text or audible 
format. 0031 provides an example of the spoken dialog). 

Funk does not specifically teach specifying means which specifies a content of 
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which 
specifies a content of the speech uttered by an utterer based on the words specified by 
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

59. Consider claim 58, Funk teaches a device control method (abstract) comprising: 

a speech recognition step of acquiring speech data representing a speech and 
specifying words represented by the speech by performing speech recognition on the 
speech data (paragraph 0018, speech recognition is provided for command and 
control); and 

a process execution step of specifying a content of control to be performed on an 
external device to be a control target based on the specified content, and performing the 
control (paragraphs 0019-0020, verbal command keywords result in the mobile unit 
performing different operations, such as retrieving information or controlling radio 
functions). 

Funk does not specifically teach a specifying step of specifying a content 
of the speech uttered by an utterer based on the words specified by the speech 
recognition step. 

In the same field of speech control, Kennewick teaches a specifying step of 
specifying a content of the speech uttered by an utterer based on the words specified by 
the speech recognition step (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the method of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

60. Consider claim 59, Funk teaches a device control method (abstract) comprising: 

a speech recognition step of acquiring speech data representing a speech and 
specifying words represented by the speech by performing speech recognition on the 
speech data (paragraph 0018, speech is acquired and applied to speech recognition, which is 
provided for command and control); 

a process specifying step of specifying a content of control to be performed on an 
external device to be a control target based on the specified content (paragraphs 0019- 
0020, verbal command keywords result in the mobile unit performing different 
operations, such as retrieving information or controlling radio functions); 

an information acquisition step of acquiring information via predetermined 
communication device (0019 and 0026, and figure 2, system may access information 
such as stock reports and weather from a voice portal server via a call); and 

a speech output step of outputting a speech based on the information acquired 
by the information acquisition step (0019, stock and weather information may be output 
to the driver in an audible format, which would be speech), 

whereby when the control specified by the process specifying step is to output 
information acquired by the information acquisition step, the speech output step outputs 
a speech based on the information (0019, keyword command may be used to access 
information from information accessing device either through text or audible format. 
0031 provides an example of the spoken dialog). 

Funk does not specifically teach a specifying step of specifying a content of the 
speech uttered by an utterer based on the words specified by the speech recognition 
step. 

In the same field of speech control, Kennewick teaches a specifying step of 
specifying a content of the speech uttered by an utterer based on the words specified by 
the speech recognition step (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

61. Consider claim 60, Funk teaches a speech recognition method (abstract) 
comprising: 

a speech recognition step of acquiring speech data representing a speech and 
specifying words represented by the speech by performing speech recognition on the 
speech data (paragraph 0018, speech recognition is provided for command and 
control); and 

a process execution step of specifying a process to be performed based on the 
specified content, and performing the process (paragraphs 0019-0020, verbal command 
keywords result in the mobile unit performing different operations, such as retrieving 
information or controlling radio functions). 

Funk does not specifically teach a specifying step of specifying a content of the 
speech uttered by an utterer based on the words specified by the speech recognition 
means. 

In the same field of speech control, Kennewick teaches a specifying step of 
specifying a content of the speech uttered by an utterer based on the words specified by 
the speech recognition step (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

62. Consider claim 61, Funk teaches a speech recognition method (abstract) 
comprising: 

a speech recognition step of acquiring speech data representing a speech and 
specifying words represented by the speech by performing speech recognition on the 
speech data (paragraph 0018, speech is acquired and applied to speech recognition, which is 
provided for command and control); 

a process specifying step of specifying a process to be performed based on the 
specified content (paragraphs 0019-0020, verbal command keywords result in the 
mobile unit performing different operations, such as retrieving information or controlling 
radio functions); 

an information acquisition step of acquiring information via predetermined 
communication means (0019 and 0026, and figure 2, system may access information 
such as stock reports and weather from a voice portal server via a call); and 

a speech output step of outputting a speech based on the information acquired by 
the information acquisition means (0019, stock and weather information may be output 
to the driver in an audible format, which would be speech), 

whereby when the control specified by the process specifying step is to output 
information acquired by the information acquisition step, the speech output step outputs 
a speech based on the information (0019, keyword command may be used to access 
information from information accessing device either through text or audible format. 
0031 provides an example of the spoken dialog). 

Funk does not specifically teach a specifying step of specifying a content of the 
speech uttered by an utterer based on the words specified by the speech recognition 
step. 

In the same field of speech control, Kennewick teaches a specifying step of 
specifying a content of the speech uttered by an utterer based on the words specified by 
the speech recognition step (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

63. Consider claim 62, Funk teaches an agent processing method (abstract, provides 
data to user) comprising: 

a speech recognition step of acquiring speech data representing a speech and 
specifying words represented by the speech by performing speech recognition on the 
speech data (paragraph 0018, speech recognition is provided for command and 
control); and 

a process execution step of specifying a process to be performed based on the 
specified content, and performing the process (paragraphs 0019-0020, verbal command 
keywords result in the mobile unit performing different operations, such as retrieving 
information or controlling radio functions). 

Funk does not specifically teach a specifying step of specifying a content of the 
speech uttered by an utterer based on the words specified by the speech recognition 
step. 

In the same field of speech control, Kennewick teaches a specifying step of 
specifying a content of the speech uttered by an utterer based on the words specified by 
the speech recognition step (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

64. Consider claim 63, Funk teaches an agent processing method (abstract, provides 
data to user) comprising: 

a speech recognition step of acquiring speech data representing a speech and 
specifying words represented by the speech by performing speech recognition on the 
speech data (paragraph 0018, speech is acquired and applied to speech recognition, which is 
provided for command and control); 

a process specifying step of specifying a process to be performed based on the 
specified content (paragraphs 0019-0020, verbal command keywords result in the 
mobile unit performing different operations, such as retrieving information or controlling 
radio functions); 

an information acquisition step of acquiring information via predetermined 
communication means (0019 and 0026, and figure 2, system may access information 
such as stock reports and weather from a voice portal server via a call); and 

a speech output step of outputting a speech based on the information acquired 
by the information acquisition means (0019, stock and weather information may be 
output to the driver in an audible format, which would be speech), 

whereby when the control specified by the process specifying step is to output 
information acquired by the information acquisition means, the speech output step 
outputs a speech based on the information (0019, keyword command may be used to 
access information from information accessing device either through text or audible 
format. 0031 provides an example of the spoken dialog). 

Funk does not specifically teach a specifying step of specifying a content of the 
speech uttered by an utterer based on the words specified by the speech recognition 
step. 

In the same field of speech control, Kennewick teaches a specifying step of 
specifying a content of the speech uttered by an utterer based on the words specified by 
the speech recognition step (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

65. Consider claim 64, Funk teaches an on-vehicle control method for controlling an 
on-vehicle device mounted on a vehicle (abstract, on board device, figure 6, control 
screen 25) comprising: 

a speech recognition step of acquiring speech data representing a speech and 
specifying words represented by the speech by performing speech recognition on the 
speech data (paragraph 0018, speech recognition is provided for command and 
control); and 

a process execution step of specifying a content of control to be performed on 
the on-vehicle device to be a control target based on the specified content, and 
performing the control (paragraphs 0019-0020, verbal command keywords result in the 
mobile unit performing different operations, such as retrieving information or controlling 
radio functions). 

Funk does not specifically teach a specifying step of specifying a content of the 
speech uttered by an utterer based on the words specified by the speech recognition 
step. 

In the same field of speech control, Kennewick teaches a specifying step of 
specifying a content of the speech uttered by an utterer based on the words specified 
by the speech recognition step (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

66. Consider claim 65, Funk teaches an on-vehicle control method for controlling an 
on-vehicle device mounted on a vehicle (abstract, on board device, figure 6, control 
screen 25) comprising: 

a speech recognition step of acquiring speech data representing a speech and 
specifying words represented by the speech by performing speech recognition on the 
speech data (paragraph 0018, speech is acquired and applied to speech recognition, which is 
provided for command and control); 

a process specifying step of specifying a content of control to be performed on 
the on-vehicle device to be a control target based on the specified content (paragraphs 
0019-0020, verbal command keywords result in the mobile unit performing different 
operations, such as retrieving information or controlling radio functions); 

an information acquisition step of acquiring information via predetermined 
communication device (0019 and 0026, and figure 2, system may access information 
such as stock reports and weather from a voice portal server via a call); and 

a speech output step of outputting a speech based on the information acquired 
by the information acquisition step (0019, stock and weather information may be output 
to the driver in an audible format, which would be speech), 

whereby when the control specified by the process specifying step is to output 
information acquired by the information acquisition step, the speech output step outputs 
a speech based on the information (0019, keyword command may be used to access 
information from information accessing device either through text or audible format. 
0031 provides an example of the spoken dialog). 

Funk does not specifically teach a specifying step of specifying a content of the 
speech uttered by an utterer based on the words specified by the speech recognition 
means. 

In the same field of speech control, Kennewick teaches a specifying step of 
specifying a content of the speech uttered by an utterer based on the words specified 
by the speech recognition step (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

67. Consider claim 66, Funk teaches a navigation method (abstract, figure 5 unit 25) 
comprising: 

a speech recognition step of acquiring speech data representing a speech and 
specifying words represented by the speech by performing speech recognition on the 
speech data (paragraph 0018, speech recognition is provided for command and 
control); and 

a process execution step of specifying a navigation process to be performed 
based on the specified content, and performing the navigation process (paragraphs 
0019-0020, verbal command keywords result in the mobile unit performing different 
operations, such as navigation control functions). 

Funk does not specifically teach a specifying step of specifying a content of the 
speech uttered by an utterer based on the words specified by the speech recognition 
step. 



Application/Control Number: 10/584,360 Page 52 

Art Unit: 2626 

In the same field of speech control, Kennewick teaches a specifying step of 
specifying a content of the speech uttered by an utterer based on the words specified by 
the speech recognition step (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

68. Consider claim 67, Funk teaches a navigation method (abstract, on board device, 
figure 6, control screen 25) comprising: 

a speech recognition step of acquiring speech data representing a speech and 
specifying words represented by the speech by performing speech recognition on the 
speech data (paragraph 0018, speech is acquired and applied to speech recognition, which is 
provided for command and control); 

a process specifying step of specifying a content of a navigation process to be 
performed based on the specified content (paragraphs 0019-0020, verbal command 
keywords result in the mobile unit performing different operations, such as navigation 
functions); 

an information acquisition step of acquiring information via predetermined 
communication means (0019 and 0026, and figure 2, system may access information 
such as stock reports and weather from a voice portal server via a call); and 

a speech output step of outputting a speech based on the information acquired 
by the information acquisition step (0019, stock and weather information may be output 
to the driver in an audible format, which would be speech), 

whereby when the control specified by the process specifying step is to output 
information acquired by the information acquisition step, the speech output step outputs 
a speech based on the information (0019, keyword command may be used to access 
information from information accessing device either through text or audible format. 
0031 provides an example of the spoken dialog). 

Funk does not specifically teach a specifying step of specifying a content of the 
speech uttered by an utterer based on the words specified by the speech recognition 
step. 

In the same field of speech control, Kennewick teaches a specifying step of 
specifying a content of the speech uttered by an utterer based on the words specified by 
the speech recognition step (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

69. Consider claim 68, Funk teaches an audio device control method (abstract, voice 
communicator) comprising: 

a speech recognition step of acquiring speech data representing a speech and 
specifying words represented by the speech by performing speech recognition on the 
speech data (paragraph 0018, speech recognition is provided for command and 
control); and 

a process execution step of specifying a content of a speech process to be 
performed based on the specified content, and performing the speech process, or 
controlling an external device in such a way as to cause the external device to perform 
the speech process (paragraphs 0019-0020, verbal command keywords result in the 
mobile unit performing different operations, including retrieving information, which may 
be presented in audible form through speech. Also see 0031 for an example of a 
speech dialog). 

Funk does not specifically teach a specifying step of specifying a content of the 
speech uttered by an utterer based on the words specified by the speech recognition 
step. 

In the same field of speech control, Kennewick teaches a specifying step of 
specifying a content of the speech uttered by an utterer based on the words specified by 
the speech recognition step (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

70. Consider claim 69, Funk teaches an audio device control method (abstract, voice 
communicator) comprising: 

a speech recognition step of acquiring speech data representing a speech and 
specifying words represented by the speech by performing speech recognition on the 
speech data (paragraph 0018, speech recognition is provided for command and 
control); and 

a process execution step of specifying a content of a speech process to be 
performed by an external audio device based on the specified content (paragraphs 
0019-0020, verbal command keywords result in the mobile unit performing different 
operations, including retrieving information, which may be presented in audible form 
through speech. Also see 0031 for an example of a speech dialog), 

an information acquisition step of acquiring information via predetermined 
communication device (0019 and 0026, and figure 2, system may access information 
such as stock reports and weather from a voice portal server via a call); and 

a speech output step of outputting a speech based on the information acquired by 
the information acquisition step (0019, stock and weather information may be output to 
the driver in an audible format, which would be speech), 

whereby when the control specified by the process specifying step is to output 
information acquired by the information acquisition step, the speech output step outputs 
a speech based on the information (0019, keyword command may be used to access 
information from information accessing device either through text or audible format. 
0031 provides an example of the spoken dialog). 

Funk does not specifically teach a specifying step of specifying a content of the 
speech uttered by an utterer based on the words specified by the speech recognition 
step. 

In the same field of speech control, Kennewick teaches a specifying step of 
specifying a content of the speech uttered by an utterer based on the words specified by 
the speech recognition step (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

71. Consider claim 70, Funk teaches a program which allows a computer to function 
as a device control device (abstract; 0018 discussing implementing using software) 
comprising: 

speech recognition means which acquires speech data representing a speech 
and specifies words represented by the speech by performing speech recognition on 
the speech data (paragraph 0018, speech recognition is provided for command and 
control); and 

process execution means which specifies a content of control to be performed on 
an external device to be a control target based on the specified content, and performs 
the control (paragraphs 0019-0020, verbal command keywords result in the mobile unit 
performing different operations, such as retrieving information or controlling radio 
functions). 

Funk does not specifically teach specifying means which specifies a content of 
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which 
specifies a content of the speech uttered by an utterer based on the words specified by 
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

72. Consider claim 71, Funk teaches a program which allows a computer to function 
as a device control device (abstract; 0018 discussing implementing using software) 
comprising: 

speech recognition means which acquires speech data representing a speech 
and specifies words represented by the speech by performing speech recognition on 
the speech data (paragraph 0018, speech is acquired and applied to speech recognition, 
which is provided for command and control); 

process specifying means which specifies a content of control to be performed 
on an external device to be a control target based on the specified content (paragraphs 
0019-0020, verbal command keywords result in the mobile unit performing different 
operations, such as retrieving information or controlling radio functions); 

information acquisition means which acquires information via predetermined 
communication means (0019 and 0026, and figure 2, system may access information 
such as stock reports and weather from a voice portal server via a call); and 

speech output means which outputs a speech based on the information acquired 
by the information acquisition means (0019, stock and weather information may be 
output to the driver in an audible format, which would be speech), 

whereby when the control specified by the process specifying means is to output 
information acquired by the information acquisition means, the speech output means 
outputs a speech based on the information (0019, keyword command may be used to 
access information from information accessing device either through text or audible 
format. 0031 provides an example of the spoken dialog). 

Funk does not specifically teach specifying means which specifies a content of 
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which 
specifies a content of the speech uttered by an utterer based on the words specified by 
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

73. Consider claim 72, Funk teaches a program which allows a computer to function 
as a speech recognition device (abstract; 0018 discussing implementing using software) 
comprising: 

speech recognition means which acquires speech data representing a speech 
and specifies words represented by the speech by performing speech recognition on 
the speech data (paragraph 0018, speech recognition is provided for command and 
control); and 

process execution means which specifies a process to be performed based on 
the specified content, and performs the process (paragraphs 0019-0020, verbal 
command keywords result in the mobile unit performing different operations, such as 
retrieving information or controlling radio functions). 

Funk does not specifically teach specifying means which specifies a content of 
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which 
specifies a content of the speech uttered by an utterer based on the words specified by 
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

74. Consider claim 73, Funk teaches a program which allows a computer to function 
as a speech recognition device (abstract; 0018 discussing implementing using software) 
comprising: 

speech recognition means which acquires speech data representing a speech 
and specifies words represented by the speech by performing speech recognition on 
the speech data (paragraph 0018, speech is acquired and applied to speech recognition, 
which is provided for command and control); 

process specifying means which specifies a process to be performed based on 
the specified content (paragraphs 0019-0020, verbal command keywords result in the 
mobile unit performing different operations, such as retrieving information or controlling 
radio functions); 

information acquisition means which acquires information via predetermined 
communication means (0019 and 0026, and figure 2, system may access information 
such as stock reports and weather from a voice portal server via a call); and 

speech output means which outputs a speech based on the information acquired 
by the information acquisition means (0019, stock and weather information may be 
output to the driver in an audible format, which would be speech), 

whereby when the control specified by the process specifying means is to output 
information acquired by the information acquisition means, the speech output means 
outputs a speech based on the information (0019, keyword command may be used to 
access information from information accessing device either through text or audible 
format. 0031 provides an example of the spoken dialog). 

Funk does not specifically teach specifying means which specifies a content of 
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which 
specifies a content of the speech uttered by an utterer based on the words specified by 
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

75. Consider claim 74, Funk teaches a program which allows a computer to function 
as an agent device (abstract provides data to user; 0018 discussing implementing using 
software) comprising: 

speech recognition means which acquires speech data representing a speech 
and specifies words represented by the speech by performing speech recognition on 
the speech data (paragraph 0018, speech recognition is provided for command and 
control); and 

process execution means which specifies a process to be performed based on 
the specified content, and performs the process (paragraphs 0019-0020, verbal 
command keywords result in the mobile unit performing different operations, such as 
retrieving information or controlling radio functions). 

Funk does not specifically teach specifying means which specifies a content of 
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which 
specifies a content of the speech uttered by an utterer based on the words specified by 
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

76. Consider claim 75, Funk teaches a program which allows a computer to function 
as an agent device (abstract provides data to user; 0018 discussing implementing using 
software) comprising: 

speech recognition means which acquires speech data representing a speech 
and specifies words represented by the speech by performing speech recognition on 
the speech data (paragraph 0018, speech is acquired and applied to speech recognition, 
which is provided for command and control); 

process specifying means which specifies a process to be performed based on 
the specified content (paragraphs 0019-0020, verbal command keywords result in the 
mobile unit performing different operations, such as retrieving information or controlling 
radio functions); 

information acquisition means which acquires information via predetermined 
communication means (0019 and 0026, and figure 2, system may access information 
such as stock reports and weather from a voice portal server via a call); and 

speech output means which outputs a speech based on the information acquired 
by the information acquisition means (0019, stock and weather information may be 
output to the driver in an audible format, which would be speech), 

whereby when the control specified by the process specifying means is to output 
information acquired by the information acquisition means, the speech output means 
outputs a speech based on the information (0019, keyword command may be used to 
access information from information accessing device either through text or audible 
format. 0031 provides an example of the spoken dialog). 

Funk does not specifically teach specifying means which specifies a content of 
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which 
specifies a content of the speech uttered by an utterer based on the words specified by 
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

77. Consider claim 76, Funk teaches a program which allows a computer to function 
as an on-vehicle control device so constructed as to be mountable on a vehicle having 
an external on-vehicle device mounted thereon (abstract on board device, figure 6, 
control screen 25; 0018 discussing implementing using software) comprising: 

speech recognition means which acquires speech data representing a speech 
and specifies words represented by the speech by performing speech recognition on 
the speech data (paragraph 0018, speech recognition is provided for command and 
control); and 

process execution means which specifies a content of control to be performed on 
the on-vehicle device to be a control target based on the specified content, and 
performs the control (paragraphs 0019-0020, verbal command keywords result in the 
mobile unit performing different operations, such as retrieving information or controlling 
radio functions). 

Funk does not specifically teach specifying means which specifies a content of 
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which 
specifies a content of the speech uttered by an utterer based on the words specified by 
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

78. Consider claim 77, Funk teaches a program which allows a computer to function 
as an on-vehicle control device so constructed as to be mountable on a vehicle having 
an external on-vehicle device mounted thereon (abstract on board device, figure 6, 
control screen 25; 0018 discussing implementing using software), comprising: 

speech recognition means which acquires speech data representing a speech 
and specifies words represented by the speech by performing speech recognition on 
the speech data (paragraph 0018, speech is acquired and applied to speech recognition, 
which is provided for command and control); 

process specifying means which specifies a content of control to be performed 
on the on-vehicle device to be a control target based on the specified content 
(paragraphs 0019-0020, verbal command keywords result in the mobile unit performing 
different operations, such as retrieving information or controlling radio functions); 

information acquisition means which acquires information via predetermined 
communication means (0019 and 0026, and figure 2, system may access information 
such as stock reports and weather from a voice portal server via a call); and 

speech output means which outputs a speech based on the information acquired 
by the information acquisition means (0019, stock and weather information may be 
output to the driver in an audible format, which would be speech), 

whereby when the control specified by the process specifying means is to output 
information acquired by the information acquisition means, the speech output means 
outputs a speech based on the information (0019, keyword command may be used to 
access information from information accessing device either through text or audible 
format. 0031 provides an example of the spoken dialog). 

Funk does not specifically teach specifying means which specifies a content of 
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which 
specifies a content of the speech uttered by an utterer based on the words specified by 
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

79. Consider claim 78, Funk teaches a program which allows a computer to function 
as a navigation device so constructed to be mountable on a vehicle (abstract, figure 5 
unit 25; 0018 discussing implementing using software) comprising: 

speech recognition means which acquires speech data representing a speech 
and specifies words represented by the speech by performing speech recognition on 
the speech data (paragraph 0018, speech recognition is provided for command and 
control); and 

process execution means which specifies a navigation process to be performed 
based on the specified content, and performs the navigation process (paragraphs 
0019-0020, verbal command keywords result in the mobile unit performing different 
operations, such as navigation control functions). 

Funk does not specifically teach specifying means which specifies a content of 
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which 
specifies a content of the speech uttered by an utterer based on the words specified by 
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

80. Consider claim 79, Funk teaches a program which allows a computer to function 
as a navigation device so constructed to be mountable on a vehicle (abstract, figure 5 
unit 25; 0018 discussing implementing using software) comprising: 

speech recognition means which acquires speech data representing a speech 
and specifies words represented by the speech by performing speech recognition on 
the speech data (paragraph 0018, speech is acquired and applied to speech recognition, 
which is provided for command and control); 

process specifying means which specifies a content of a navigation process to be 
performed based on the specified content (paragraphs 0019-0020, verbal command 
keywords result in the mobile unit performing different operations, such as navigation 
functions); 

information acquisition means which acquires information via predetermined 
communication means (0019 and 0026, and figure 2, system may access information 
such as stock reports and weather from a voice portal server via a call); and 

speech output means which outputs a speech based on the information acquired 
by the information acquisition means (0019, stock and weather information may be 
output to the driver in an audible format, which would be speech), 

whereby when the control specified by the process specifying means is to output 
information acquired by the information acquisition means, the speech output means 
outputs a speech based on the information (0019, keyword command may be used to 
access information from information accessing device either through text or audible 
format. 0031 provides an example of the spoken dialog). 

Funk does not specifically teach specifying means which specifies a content of 
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which 
specifies a content of the speech uttered by an utterer based on the words specified by 
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

81. Consider claim 80, Funk teaches a program which allows a computer to function 
as an audio device (abstract, voice communicator; 0018 discussing implementing using 
software) comprising: 

speech recognition means which acquires speech data representing a speech 
and specifies words represented by the speech by performing speech recognition on 
the speech data (paragraph 0018, speech recognition is provided for command and 
control); and 

process execution means which specifies a content of a speech process to be 
performed based on the specified content, and performs the speech process, or 
controls an external device in such a way as to cause the external device to perform the 
speech process (paragraphs 0019-0020, verbal command keywords result in the mobile 
unit performing different operations, including retrieving information, which may be 
presented in audible form through speech. Also see 0031 for an example of a speech 
dialog). 

Funk does not specifically teach specifying means which specifies a content of 
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which 
specifies a content of the speech uttered by an utterer based on the words specified by 
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

82. Consider claim 81, Funk teaches a program which allows a computer to function 
as an audio device (abstract, voice communicator; 0018 discussing implementing using 
software) comprising: 

speech recognition means which acquires speech data representing a speech 
and specifies words represented by the speech by performing speech recognition on 
the speech data (paragraph 0018, speech recognition is provided for command and 
control); and 

process execution means which specifies a content of a speech process to be 
performed based on the specified content (paragraphs 0019-0020, verbal command 
keywords result in the mobile unit performing different operations, including retrieving 
information, which may be presented in audible form through speech. Also see 0031 for 
an example of a speech dialog), 

information acquisition means which acquires information via predetermined 
communication means (0019 and 0026, and figure 2, system may access information 
such as stock reports and weather from a voice portal server via a call); and 

speech output means which outputs a speech based on the information acquired 
by the information acquisition means (0019, stock and weather information may be 
output to the driver in an audible format, which would be speech), 

whereby when the control specified by the process specifying means is to output 
information acquired by the information acquisition means, the speech output means 
outputs a speech based on the information (0019, keyword command may be used to 
access information from information accessing device either through text or audible 
format. 0031 provides an example of the spoken dialog). 

Funk does not specifically teach specifying means which specifies a content of 
the speech uttered by an utterer based on the words specified by the speech 
recognition means. 

In the same field of speech control, Kennewick teaches specifying means which 
specifies a content of the speech uttered by an utterer based on the words specified by 
the speech recognition means (paragraphs 0160-0161, speech tokens are passed to a 
parser to determine the context and domain of a user's command. Domain and context 
correspond to what the user is actually asking for.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the speech parser as taught by Kennewick at the output of the 
speech recognition in the system of Funk in order to allow the system to better 
understand natural language queries, and reduce the need for the use of keywords 
(Kennewick 0006-0007). 

83. Claims 2, 3, 12, 21, 22, 32, 41, and 50 are rejected under 35 U.S.C. 103(a) as 
being unpatentable over Funk and Kennewick as applied to claims 1, 11, 20, 31, 40, 
and 49 above, and further in view of Potter (US Patent 5,729,659). 

84. Consider claim 2, Funk and Kennewick teach the device control device according 
to claim 1, but do not specifically teach that the speech recognition means includes 
speech part specifying means which specifies a part of speech of the specified words, 
and 

the specifying means specifies a content of the speech uttered by the utterer 
based only on those of the words specified by the speech recognition means which are 
specified as a predetermined part of speech. 

In the same field of speech control, Potter teaches speech recognition means 
includes speech part specifying means which specifies a part of speech of the specified 
words (column 13 line 45 to column 14 line 20, each word in an input is assigned a part of 
speech, and context information is generated), and 

the specifying means specifies a content of the speech uttered by the utterer 
based only on those of the words specified by the speech recognition means which are 
specified as a predetermined part of speech (column 13 lines 35-45 show how Part of 
Speech information is used to help determine content of an input. Because every word 
is assigned part of speech, only those words assigned are used to specify context, even 
though every word may be used to specify context). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to determine Part of Speech information as taught by Potter in the 
system of Funk and Kennewick in order to allow the system to determine the meaning 
of each word, which may vary depending on the part of speech, as some words may 
have multiple parts of speech depending on usage (Potter column 13 lines 45-55). 

85. Consider claim 3, Funk, Kennewick, and Potter teach the device control device 
according to claim 2, wherein the specifying means discriminates whether or not a 
combination of a plurality of words in the words specified by the speech recognition 
means which is specified as a predetermined part of speech (Potter column 13 lines 
35-45 show how Part of Speech information is used to help determine content of an 
input. Because every word is assigned part of speech, only those words assigned are 
used to specify context, even though every word may be used to specify context) meets a 
predetermined condition (Kennewick, 0160-0161, possible contexts are scored, and the 
most likely is determined. In this case, a context being the most likely candidate is the 
predetermined condition for its output.), and specifies a content of the speech uttered by 
the utterer based on a discrimination result (Kennewick, 0160-0161, possible contexts 
are scored, and the most likely is determined. In this case, a context being the most 
likely candidate is the predetermined condition for its output). 

86. Consider claim 12, Funk and Kennewick teach the speech recognition device 
according to claim 11, but do not specifically teach that the speech recognition means 
includes speech part specifying means which specifies a part of speech of the specified 
words, and 

the specifying means specifies a content of the speech uttered by the utterer 
based only on those of the words specified by the speech recognition means which are 
specified as a predetermined part of speech. 

In the same field of speech control, Potter teaches speech recognition means 
includes speech part specifying means which specifies a part of speech of the specified 
words (column 13 line 45 to column 14 line 20, each word in an input is assigned a part of 
speech, and context information is generated), and 

the specifying means specifies a content of the speech uttered by the utterer 
based only on those of the words specified by the speech recognition means which are 
specified as a predetermined part of speech (column 13 lines 35-45 show how Part of 
Speech information is used to help determine content of an input. Because every word 
is assigned part of speech, only those words assigned are used to specify context, even 
though every word may be used to specify context). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to determine Part of Speech information as taught by Potter in the 
system of Funk and Kennewick in order to allow the system to determine the meaning 
of each word, which may vary depending on the part of speech, as some words may 
have multiple parts of speech depending on usage (Potter column 13 lines 45-55). 

87. Consider claim 21, Funk and Kennewick teach the agent device according to 
claim 20, but do not specifically teach that the speech recognition means includes speech 
part specifying means which specifies a part of speech of the specified words, and 

the specifying means specifies a content of the speech uttered by the utterer 
based only on those of the words specified by the speech recognition means which are 
specified as a predetermined part of speech. 

In the same field of speech control, Potter teaches speech recognition means 
includes speech part specifying means which specifies a part of speech of the specified 
words (column 13 line 45 to column 14 line 20, each word in an input is assigned a part of 
speech, and context information is generated), and 

the specifying means specifies a content of the speech uttered by the utterer 
based only on those of the words specified by the speech recognition means which are 
specified as a predetermined part of speech (column 13 lines 35-45 show how Part of 
Speech information is used to help determine content of an input. Because every word 
is assigned part of speech, only those words assigned are used to specify context, even 
though every word may be used to specify context). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to determine Part of Speech information as taught by Potter in the 
system of Funk and Kennewick in order to allow the system to determine the meaning 
of each word, which may vary depending on the part of speech, as some words may 
have multiple parts of speech depending on usage (Potter column 13 lines 45-55). 

88. Consider claim 22, Funk, Kennewick, and Potter teach the agent device 
according to claim 21, wherein the specifying means discriminates whether or not a 
combination of a plurality of words in the words specified by the speech recognition 
means which is specified as a predetermined part of speech (Potter column 13 lines 
35-45 show how Part of Speech information is used to help determine content of an 
input. Because every word is assigned part of speech, only those words assigned are 
used to specify context, even though every word may be used to specify context) meets a 
predetermined condition (Kennewick, 0160-0161, possible contexts are scored, and the 
most likely is determined. In this case, a context being the most likely candidate is the 
predetermined condition for its output.), and specifies a content of the speech uttered by 
the utterer based on a discrimination result (Kennewick, 0160-0161, possible contexts 
are scored, and the most likely is determined. In this case, a context being the most 
likely candidate is the predetermined condition for its output). 

89. Consider claim 32, Funk and Kennewick teach the on-vehicle control device 
according to claim 31, but do not specifically teach that the speech recognition means 
includes speech part specifying means which specifies a part of speech of the specified 
words, and 

the specifying means specifies a content of the speech uttered by the utterer 
based only on those of the words specified by the speech recognition means which are 
specified as a predetermined part of speech. 

In the same field of speech control, Potter teaches speech recognition means 
includes speech part specifying means which specifies a part of speech of the specified 
words (column 13 line 45 to column 14 line 20, each word in an input is assigned a part of 
speech, and context information is generated), and 

the specifying means specifies a content of the speech uttered by the utterer 
based only on those of the words specified by the speech recognition means which are 
specified as a predetermined part of speech (column 13 lines 35-45 show how Part of 
Speech information is used to help determine content of an input. Because every word 
is assigned part of speech, only those words assigned are used to specify context, even 
though every word may be used to specify context). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to determine Part of Speech information as taught by Potter in the 
system of Funk and Kennewick in order to allow the system to determine the meaning 
of each word, which may vary depending on the part of speech, as some words may 
have multiple parts of speech depending on usage (Potter column 13 lines 45-55). 

90. Consider claim 41, Funk and Kennewick teach the navigation device according to 
claim 40, but do not specifically teach that the speech recognition means includes speech 
part specifying means which specifies a part of speech of the specified words, and 

the specifying means specifies a content of the speech uttered by the utterer 
based only on those of the words specified by the speech recognition means which are 
specified as a predetermined part of speech. 

In the same field of speech control, Potter teaches speech recognition means 
includes speech part specifying means which specifies a part of speech of the specified 
words (column 13 line 45 to column 14 line 20, each word in an input is assigned a part of 
speech, and context information is generated), and 

the specifying means specifies a content of the speech uttered by the utterer 
based only on those of the words specified by the speech recognition means which are 
specified as a predetermined part of speech (column 13 lines 35-45 show how Part of 
Speech information is used to help determine content of an input. Because every word 
is assigned part of speech, only those words assigned are used to specify context, even 
though every word may be used to specify context). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to determine Part of Speech information as taught by Potter in the 
system of Funk and Kennewick in order to allow the system to determine the meaning 
of each word, which may vary depending on the part of speech, as some words may 
have multiple parts of speech depending on usage (Potter column 13 lines 45-55). 

91. Consider claim 50, Funk and Kennewick teach the audio device according to 
claim 49, but do not specifically teach that the speech recognition means includes speech 
part specifying means which specifies a part of speech of the specified words, and 

the specifying means specifies a content of the speech uttered by the utterer 
based only on those of the words specified by the speech recognition means which are 
specified as a predetermined part of speech. 

In the same field of speech control, Potter teaches speech recognition means 
includes speech part specifying means which specifies a part of speech of the specified 
words (column 13 line 45 to column 14 line 20, each word in an input is assigned a part of 
speech, and context information is generated), and 

the specifying means specifies a content of the speech uttered by the utterer 
based only on those of the words specified by the speech recognition means which are 
specified as a predetermined part of speech (column 13 lines 35-45 show how Part of 
Speech information is used to help determine content of an input. Because every word 
is assigned part of speech, only those words assigned are used to specify context, even 
though every word may be used to specify context). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to determine Part of Speech information as taught by Potter in the 
system of Funk and Kennewick in order to allow the system to determine the meaning 
of each word, which may vary depending on the part of speech, as some words may 
have multiple parts of speech depending on usage (Potter column 13 lines 45-55). 

Conclusion 

92. The prior art made of record and not relied upon that is considered pertinent to 
applicant's disclosure is listed in the Notice of References Cited. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to DOUGLAS C. GODBOLD whose telephone number is 
(571) 270-1451. The examiner can normally be reached Monday-Thursday, 7:00am-4:30pm, 
and Friday, 7:00am-3:30pm. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Richemond Dorvil, can be reached at (571) 272-7602. The fax phone 
number for the organization where this application or proceeding is assigned is 
571-273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 